Optimization of Percentage Cube Queries

Authors

Yiqun Zhang

Carlos Ordonez

Javier García-García

Ladjel Bellatreche

Abstract

OLAP cubes are a powerful database technology for joining tables and aggregating data to discover interesting trends. However, OLAP cubes are limited in their ability to uncover fractional relationships on a measure aggregated at multiple granularity levels. One prominent example is the percentage, an intuitive probabilistic metric used in practically every analytic application. With this motivation, we introduce the percentage cube, a generalized data cube that takes percentages as the target aggregated measure. Specifically, in every cuboid the percentage cube shows the fractional relationship on a measure between fact table rows grouped by a set of columns (the detailed individual groups) and their rolled-up aggregation by a subset of those grouping columns (the total group). We introduce minimal query syntax and carefully study query optimization for computing percentage cubes. Percentage cubes turn out to be significantly more difficult to evaluate than standard data cubes because, in addition to the exponential number of cuboids, there exists a doubly exponential number of grouping column pairs (grouping columns at the individual level and at the total level) on which percentages are computed. Fortunately, it is feasible to prune the search space with a threshold, similar to iceberg queries. Experiments on a DBMS compare our novel query optimizations against existing SQL OLAP window functions. Our benchmark results show that our proposed SQL extension is more abstract, more intuitive, and faster than existing SQL functions for computing percentages on the cube.
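To make the definition above concrete, the following is a minimal in-memory sketch in Python of what a percentage cube contains. It is not the authors' SQL extension or their optimized evaluation strategy, neither of which is shown in this abstract. The names `aggregate`, `percentage_cube`, `dims`, `measure`, and `threshold` are hypothetical. The sketch assumes that the measure is summed, that the total level is any proper subset of the cuboid's grouping columns (including the empty set, i.e. the grand total), and that the threshold is applied as a simple filter on the result rather than as true search-space pruning.

```python
from collections import defaultdict
from itertools import combinations


def aggregate(rows, cols, measure):
    """Return SUM(measure) per group, keyed by the tuple of values in cols."""
    sums = defaultdict(float)
    for row in rows:
        sums[tuple(row[c] for c in cols)] += row[measure]
    return sums


def percentage_cube(rows, dims, measure, threshold=0.0):
    """List (cuboid, total_cols, group, pct) for every cuboid and total level.

    cuboid     : grouping columns at the individual level
    total_cols : subset of cuboid used for the rolled-up total (assumption:
                 any proper subset, including the empty grand total)
    pct        : individual sum divided by the matching total sum
    """
    out = []
    for k in range(1, len(dims) + 1):
        for cuboid in combinations(dims, k):
            detail = aggregate(rows, cuboid, measure)
            for t in range(k):  # proper subsets only: 0 .. k-1 columns
                for total_cols in combinations(cuboid, t):
                    totals = aggregate(rows, total_cols, measure)
                    # positions of the total columns inside the cuboid key
                    pos = [cuboid.index(c) for c in total_cols]
                    for group, value in detail.items():
                        total = totals[tuple(group[i] for i in pos)]
                        if total == 0:
                            continue  # percentage undefined; skip the group
                        pct = value / total
                        # iceberg-style filter: keep only large percentages
                        if pct >= threshold:
                            out.append((cuboid, total_cols, group, pct))
    return out


# Toy fact table (made-up data, for illustration only).
rows = [
    {"state": "CA", "city": "LA", "sales": 30.0},
    {"state": "CA", "city": "SF", "sales": 10.0},
    {"state": "TX", "city": "Houston", "sales": 60.0},
]
cube = percentage_cube(rows, ("state", "city"), "sales")
```

With this toy table, the entry for the cuboid (state, city) with total level (state) reports LA as 0.75 of the CA total. Even with two dimensions the sketch evaluates five (individual, total) column pairs, which hints at why the abstract treats pruning as essential.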

Similar resources

Multiresolution Cube Estimators for Sensor Network Aggregate Queries

In this work we present in-network techniques to improve the efficiency of spatial aggregate queries. Such queries are very common in a sensornet setting, demanding more targeted techniques for their handling. Our approach constructs and maintains multi-resolution cube hierarchies inside the network, which can be constructed in a distributed fashion. In case of failures, recovery can also be pe...

A Clustered Dwarf Structure to Speed Up Queries on Data Cubes

Dwarf is a highly compressed structure that compresses the cube by eliminating semantic redundancies while computing a data cube. Although it has a high compression ratio, Dwarf is slower to query and more difficult to update because of its structural characteristics. The original intention of the data cube is to speed up query performance, so we propose two novel clusterin...

Conceptual Object Modeling for OLAP Cubes in a Data Warehousing Environment

Datacubes are efficient structures used to represent multidimensional aggregates at various levels. Quite often, multiple datacubes are predefined and computed to assist analytical queries in decision support systems. However, being statically defined structures, they suffer from some inherent problems. Firstly, conventional datacubes are highly inefficient when created over sparse data....

Les dépendances fonctionnelles pour la sélection de vues dans les cubes de données

OLAP query processing assumes two seemingly contradictory requirements: on the one hand, query processing should be fast (hence query precomputation); on the other hand, queries are assumed to be submitted in an ad hoc manner, which sometimes makes workload-based optimization ineffective (we cannot materialize an infinite number of queries). Thus, in this paper we address the specific case of ...

Summarizing Datacubes: Semantic and Syntactic Approaches

Datacubes are especially useful for efficiently answering queries on data warehouses. Nevertheless, the amount of aggregated data generated is huge compared with the initial data, which is itself very large. Recent research has addressed the issue of summarizing datacubes in order to reduce their size. In this chapter, we present three different approaches. They propose structures which ma...

Publication date: 2017
